Uncertainty Decoding for Noise Robust Automatic Speech Recognition
Abstract
This report presents uncertainty decoding as a method for robust automatic speech recognition, as part of the Noise Robust Automatic Speech Recognition project funded by Toshiba Research Europe Limited. The effects of noise on speech recognition are reviewed and a general framework for noise robust speech recognition is introduced. Common and related noise robustness techniques are described in the context of this framework. Uncertainty decoding is also presented in this framework, with the goal of providing fast noise compensation by propagating uncertainty to the decoder. Two forms, the Joint and SPLICE methods, are discussed and evaluated on the medium vocabulary Resource Management corpus at a range of artificially produced noise levels. The uncertainty decoding algorithms did not match the performance of a matched system, but were more accurate than both the baseline SPLICE enhancement technique and CMLLR with a small number of transforms.

Notation
A: matrix of arbitrary dimensions
A^T: transpose of matrix A
|A|: determinant of matrix A
A^-1: inverse of matrix A
I: identity matrix
x: scalar quantity
x̂: estimate of the true value of x
x (bold): vector of arbitrary dimensions
T: number of frames in a sequence of observations
M: number of GMM components in the acoustic model
Similar resources
Improving the performance of MFCC for Persian robust speech recognition
Mel-frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing step which applies to t...
Uncertainty Decoding with Adaptive Sampling for Noise Robust DNN-Based Acoustic Modeling
Although deep neural network (DNN) based acoustic models have obtained remarkable results, automatic speech recognition (ASR) performance remains low in noisy and reverberant conditions. To address this issue, a speech enhancement front-end is often used before recognition to reduce noise. However, the front-end cannot fully suppress noise and often introduces artifacts that are limit...
Issues with uncertainty decoding for noise robust automatic speech recognition
Interest is growing in a class of robustness algorithms that exploit the notion of uncertainty introduced by environmental noise. The majority of these techniques share the property that the uncertainty of an observation due to noise is propagated to the recogniser, resulting in increased model variances. Using appropriate approximations, efficient implementations may be obtained, with the goal...
Investigations into Uncertainty Decoding Employing a Discrete Feature Space for Noise Robust Automatic Speech Recognition
This paper addresses the robustness of automatic speech recognition to environmental noise. In order to account for reliability of the clean feature estimate we employ the feature posterior density conditioned on observed noisy features to perform uncertainty decoding. We investigate two approaches to estimate the posterior using a discrete feature space, first conditioning only on the current ...
A computational auditory scene analysis system for speech segregation and robust speech recognition
A conventional automatic speech recognizer does not perform well in the presence of multiple sound sources, while human listeners are able to segregate and recognize a signal of interest through auditory scene analysis. We present a computational auditory scene analysis system for separating and recognizing target speech in the presence of competing speech or noise. We estimate, in two stages, ...